ABSTRACT

A test of the hypothesis that the second parameters
("success" probabilities) of a number of paired binomial
distributions are pairwise equal is derived under weak
assumptions. Computer codes necessary to implement the
procedure are given and a case study is used to demonstrate
the procedure. Some other procedures for testing the same
hypothesis under stronger assumptions are discussed and
compared with the given procedure. A rapid approximate
procedure is also given.
CHAPTER I
INTRODUCTION 


In certain experimental situations one is interested in the hypothesis of pairwise equality of means in (say) $n$ pairs of binomial distributions, but has no interest in what that mean (for a given pair) is. Specifically, if we denote the "success probability" parameter of the first member of the $i$th pair by $d_i$ and the same parameter of the second member by $f_i$, then it is desired to test the hypothesis

$$H_0: d_i = f_i, \quad i = 1, 2, \ldots, n$$

against various alternate hypotheses which are left unspecified at this point. Note that the relationships (if any) between $d_i$ and $d_j$ or between $f_i$ and $f_j$, for $i \neq j$, are of no concern, nor are the numerical values of $d_i$ and $f_i$.

A practical example of the need for this type of hypothesis test arises in the examination of a detection system. A basic requirement of a detection system is that it should perform better than a "random" system, which is defined as one in which $d$, the probability of detecting a target (given that one is actually present), is equal to $f$, the probability of giving a false alarm. A system may be considered better than random if $d > f$.


Suppose we have a system which may respond differently under different operators. If $d_i$ and $f_i$ are the detection and false alarm probabilities, respectively, for the $i$th operator, we are naturally interested in testing

$$H_0: d_i = f_i, \quad i = 1, 2, \ldots, n$$

against, say,

$$H_1: d_i > f_i, \quad i = 1, 2, \ldots, n.$$


In some situations, one might require a two sided test or 
perhaps a less stringent alternate hypothesis. 

In any case, the matter of interest is whether the 
system performs better than "random" for each operator, 
without regard to possible differences between operators. 
For the remainder of this paper, the detection system 
example will serve as a "prototype" case. It is hoped that 
this will result in increased intuitive appeal and clarity 
of presentation. 

A typical experiment conducted to facilitate performance of the desired hypothesis test involves $n$ operators, the $i$th of whom makes $r_i$ detection attempts with targets actually present and $s_i$ attempts with no targets present. The data then consist of $n$ pairs of observations $(j_i, k_i)$, where $j_i$ is a realization of the random variable $J_i$, the number of true detections made by the $i$th operator from among the $r_i$ attempts with targets present, and $k_i$ is a realization of the random variable $K_i$, the number of false alarms registered from among the $s_i$ attempts made without targets present. We assume the $i$th operator makes true detections with constant probability $d_i$ and generates false alarms with constant probability $f_i$. Thus $J_i$ is binomially distributed with "success" probability parameter $d_i$ and number of trials parameter $r_i$ (hereafter written $J_i \overset{d}{=} b(r_i, d_i)$). Similarly, we have $K_i \overset{d}{=} b(s_i, f_i)$.
We will assume that the experimental design and general conditions are such that the following assumptions are satisfied:

1. $r_i = s_i = m\ \forall i$ (made only for mathematical simplicity)
2. $J_i \overset{d}{=} b(m, d_i)$, $K_i \overset{d}{=} b(m, f_i)\ \forall i$; $i = 1, 2, \ldots, n$
3. $J_i$ and $K_i$ are independent $\forall i$; $i = 1, 2, \ldots, n$
4. $J_i$ and $J_j$ are independent for $i \neq j$
5. $K_i$ and $K_j$ are independent for $i \neq j$


SOME EARLIER APPROACHES 

Before proceeding with our development of the desired 
hypothesis test, we briefly discuss two procedures which 
have been used in the past. Since both of these procedures 
require assumptions stronger than those listed above, they 
are of restricted applicability. 
The Equal Performance Assumption. If the additional assumption is made that $d_i = d\ \forall i$ and $f_i = f\ \forall i$, then the experimental data may be summed into a single pair of observations $j' = \sum_i j_i$ and $k' = \sum_i k_i$, where $j'$ and $k'$ are, respectively, realizations of $J'$ and $K'$, where $J' \overset{d}{=} b(nm, d)$ and $K' \overset{d}{=} b(nm, f)$.

The natural hypotheses under test are then

$$H_0: d = f \quad \text{against} \quad H_1: d > f$$

which may readily be tested by the exact method discussed below¹ or by an appropriate asymptotic test².

A hypothetical example serves to illustrate the dangers of improper application of the foregoing procedure. Suppose that $n/2$ of the observed pairs were $(m, 0)$, while the other $n/2$ were $(0, m)$. Then $j' = nm/2$ and $k' = nm/2$ and the hypothesis that $d = f$ would be accepted. What has probably happened, however, is that $d_i > f_i$ for about half the operators while $d_i < f_i$ for the remainder. Summing over all individuals has obscured information to this effect contained in the data.
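The pooling effect described above can be checked numerically. The following is a minimal sketch; the values of `m` and `n` and the helper name `pooled_sums` are illustrative assumptions, not part of the original study.

```python
def pooled_sums(pairs):
    """Return (j', k'): total detections and total false alarms over all operators."""
    j_total = sum(j for j, _ in pairs)
    k_total = sum(k for _, k in pairs)
    return j_total, k_total

# Hypothetical experiment: m trials per condition, n operators.
m, n = 10, 8
pairs = [(m, 0)] * (n // 2) + [(0, m)] * (n // 2)

j_sum, k_sum = pooled_sums(pairs)
print("pooled sums:", j_sum, k_sum)  # both equal n*m/2

# The per-operator differences are as large as they can be,
# even though the pooled sums are identical.
diffs = [j - k for j, k in pairs]
print("per-operator differences:", diffs)
```

The pooled statistics carry no trace of the per-operator differences, which is the information loss the text warns about.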
In an extreme case, such as the above, the experimenter should recognize that the equal performance assumption does not hold. However, suppose the case were less extreme, say that given in Table 1 (assume $m > 6$).


TABLE 1 


Results of a Hypothetical Experiment 


Number of observations    Value of observation (j_i, k_i)
n/4                       (m - 2, 4)
n/4                       (m - 6, 2)
n/4                       (5, m - 3)
n/4 (ae) 




Here, testing under the assumption of equal performance yields $j' = (n - 1)m/2$ and $k' = (n - 1)m/2$, and again, one would accept the hypothesis that $d = f$. With this result, the experimenter would not be likely to suspect the equal performance assumption, yet it is not clear that it is satisfied.
One might attempt to "validate" the assumption of equal performance by conducting a homogeneity test on the observation vectors $(j_1, j_2, \ldots, j_n)$ and $(k_1, k_2, \ldots, k_n)$ or by goodness of fit tests to some binomial distributions. However, the homogeneity test for binomial data is a strictly intuitive procedure, and thus not entirely desirable; also the selection of which binomial distribution to test the data against poses a considerable problem. Further, in both cases, the size of the overall procedure is difficult to determine. We therefore conclude this discussion by noting that testing for homogeneity among binomial samples is an area in which additional work appears to be needed.


The Analysis of Variance Approach. Bartlett⁶ has suggested and Curtiss⁷ has shown that the arcsine transformation $Y = \arcsin\sqrt{X/m}$, where $X \overset{d}{=} b(m, p)$, imparts to $Y$ an approximate normal distribution with mean $\arcsin\sqrt{p}$ and variance $1/4m$ (denoted $Y \overset{d}{\approx} N(\arcsin\sqrt{p},\ 1/4m)$). Using this technique in the prototype case we have $J_i^* = \arcsin\sqrt{J_i/m}$ and $K_i^* = \arcsin\sqrt{K_i/m}$ distributed approximately normal with means $\arcsin\sqrt{d_i}$ and $\arcsin\sqrt{f_i}$ respectively, and the same variance $1/4m$.
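To get a feel for how well the arcsine approximation holds, the exact mean and variance of the transformed variable can be computed directly from the binomial probabilities. This is a sketch only; the function name `arcsine_moments` and the parameter values are illustrative assumptions.

```python
from math import asin, comb, sqrt

def arcsine_moments(m, p):
    """Exact mean and variance of Y = arcsin(sqrt(X/m)) for X ~ b(m, p)."""
    pmf = [comb(m, x) * p**x * (1 - p)**(m - x) for x in range(m + 1)]
    ys = [asin(sqrt(x / m)) for x in range(m + 1)]
    mean = sum(w * y for w, y in zip(pmf, ys))
    var = sum(w * (y - mean) ** 2 for w, y in zip(pmf, ys))
    return mean, var

# Compare a mid-range p with an extreme p for the same m.
for m, p in [(50, 0.5), (50, 0.05)]:
    mean, var = arcsine_moments(m, p)
    print(f"m={m} p={p}: mean={mean:.4f} (target {asin(sqrt(p)):.4f}), "
          f"var={var:.5f} (target {1 / (4 * m):.5f})")
```

Comparing the two printed rows gives a rough indication of how the approximation behaves as $p$ moves toward an extreme value.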


If we now think of having a target present as treatment D and having no target present as treatment F, then the transformed variable $J_i^* = \arcsin\sqrt{J_i/m}$ represents a "response" under treatment D, while $K_i^* = \arcsin\sqrt{K_i/m}$ represents a response under treatment F. We can now model these responses, say $J_i^*$ as $J_i^* = \tau_D + \theta_{iD} + \epsilon_{iD}$, where

$\tau_D$ = mean (transformed) response under treatment D
$\theta_{iD}$ = differential (transformed) response under treatment D for the $i$th operator
$\epsilon_{iD}$ = (transformed) sampling error for the observation on the $i$th operator under treatment D.

$K_i^*$ could be modeled analogously as $\tau_F + \theta_{iF} + \epsilon_{iF}$, and we can assume that:

1. $\theta_{i\gamma}$ is asymptotically $N(0, \sigma_\theta^2)$, $\gamma = F, D$; $i = 1, 2, \ldots, n$
2. $\epsilon_{i\gamma}$ is asymptotically $N(0, \sigma^2)$, $\gamma = F, D$; $i = 1, 2, \ldots, n$


The random factor model, two way classification Analysis of Variance may then be used to test $H_0: d_i = f_i\ \forall i$ against the somewhat restricted and unnatural alternate hypothesis

$$H_1: d_i \neq f_i \ \text{for some } i.$$


A further problem with this approach is that it requires $m$ "large" and $d_i$ and $f_i$ not "extreme", say $0.2 < d_i < 0.8$ and $0.2 < f_i < 0.8$, in order for the arcsine transformation to yield reasonably normally distributed random variables. Since we cannot (indeed, may not want to) assure that these additional requirements are met, this approach is of limited applicability.


A MORE GENERAL APPROACH


The purpose of this paper is to develop a procedure to test the hypothesis $H_0: d_i = f_i\ \forall i$ against various alternate hypotheses (as yet unspecified) under the four assumptions listed above, and no others.

In Chapter 2, we develop a graphical representation of the observations $(j_i, k_i)$, derive the likelihood ratio test critical region characterization, present an exact test for $H_0: d_i = f_i$ against $H_1: d_i \neq f_i$ (or $d_i > f_i$) for a given individual, and explore three apparently unfeasible approaches.

Development of a general two stage test procedure is given in Chapter 3, and in Chapter 4 we adapt the two stage test procedure to the prototype situation, present a detailed case study, and make some comparisons with the earlier approaches. We conclude with the derivation of an approximate, but rapid procedure which may be useful in some cases. Computer codes necessary to implement the two stage procedure are given in two appendices.
CHAPTER II
BACKGROUND AND "CONVENTIONAL" APPROACHES 


The user of "conventional" approaches to testing the hypothesis $H_0: d_i = f_i\ \forall i$ against some alternate meets with difficulty due to the need to specify a value of $d_i$ (or $f_i$), as will be shown below. Before doing so, however, it is convenient to introduce a graphical approach to the problem, derive the likelihood ratio test, and consider a special case ($n = 1$).


A GRAPHICAL REPRESENTATION 

For ease of communication, it seems advantageous to utilize a graphical display of the data such as shown in Figure 1, where the observations $(j_i, k_i)$ are normalized (by dividing by $m$) and plotted in a unit square. Intuitively, one feels that if $d_i = f_i\ \forall i$, then the points $(j_i/m, k_i/m)$ should all lie near the line $j = k$, whereas if $d_i > f_i\ \forall i$ then the points $(j_i/m, k_i/m)$ should tend to lie to the right of the line $d = f$, etc. In a sense, the test of $H_0: d_i = f_i\ \forall i$ against some alternate hypothesis is a test of "closeness" of the points $(j_i/m, k_i/m)$ to the line $j_i = k_i$. This concept will be helpful in the succeeding discussions.


THE LIKELIHOOD RATIO TEST 
Figure 1. Graphical Representation of Data

It would be desirable to have a likelihood ratio test, since the Neyman-Pearson Lemma then provides that for simple hypotheses, the test is best of its size. Let us consider specifically the test of $H_0: d_i = f_i\ \forall i$ against $H_1: d_i \neq f_i\ \forall i$. Let $v$ be the vector $(j_1, k_1, j_2, k_2, \ldots, j_n, k_n)$. By independence the likelihood of $v$ is given by

$$L(v) = \prod_{i=1}^{n} \binom{m}{j_i}\binom{m}{k_i}\, d_i^{\,j_i}\, f_i^{\,k_i}\, (1-d_i)^{m-j_i}\, (1-f_i)^{m-k_i} \tag{1}$$

The parameter space $\Omega$ is given by

$$\Omega = \{(d_i, f_i) : 0 \le d_i \le 1,\ 0 \le f_i \le 1;\ i = 1, 2, \ldots, n\} \tag{2}$$

while the null space $\omega$ is given by

$$\omega = \{(d_i, f_i) : 0 \le d_i \le 1,\ 0 \le f_i \le 1,\ d_i = f_i;\ i = 1, 2, \ldots, n\} \tag{3}$$

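It is convenient to operate on $\ln L$ rather than $L$ directly, so we maximize

$$\ln L(v) = \sum_{i=1}^{n} \left[ j_i \ln d_i + k_i \ln f_i + (m - j_i)\ln(1 - d_i) + (m - k_i)\ln(1 - f_i) \right]^{*} \tag{4}$$

over the null and parameter spaces. Adopting the convention $0 \cdot \ln 0 = 0$ (that is, extending the function $x \ln x$ to 0 by continuity), and maximizing yields

$$\max_{\omega} \ln L(v) = \sum_{i=1}^{n} \left[ (j_i + k_i)\ln\!\left(\frac{j_i + k_i}{2m}\right) + (2m - j_i - k_i)\ln\!\left(1 - \frac{j_i + k_i}{2m}\right) \right]^{*} \tag{5}$$

and

$$\max_{\Omega} \ln L(v) = \sum_{i=1}^{n} \left[ j_i \ln\!\left(\frac{j_i}{m}\right) + k_i \ln\!\left(\frac{k_i}{m}\right) + (m - j_i)\ln\!\left(1 - \frac{j_i}{m}\right) + (m - k_i)\ln\!\left(1 - \frac{k_i}{m}\right) \right]^{*} \tag{6}$$

* up to additive constant terms

The logarithm of the likelihood ratio $\lambda(v) = \max_{\omega} L(v) / \max_{\Omega} L(v)$ is given by

$$\ln\lambda(v) = \sum_{i=1}^{n} \left[ (j_i + k_i)\ln\!\left(\frac{j_i + k_i}{2m}\right) + (2m - j_i - k_i)\ln\!\left(1 - \frac{j_i + k_i}{2m}\right) - j_i \ln\!\left(\frac{j_i}{m}\right) - k_i \ln\!\left(\frac{k_i}{m}\right) - (m - j_i)\ln\!\left(1 - \frac{j_i}{m}\right) - (m - k_i)\ln\!\left(1 - \frac{k_i}{m}\right) \right] \tag{7}$$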
and we may characterize the test by rejecting when $\ln\lambda(v) < C$ (constant). The boundary of the critical region is then the hypersurface in $(2n+1)$ dimensions defined by $\ln\lambda(v) = C$, which is difficult to visualize. However, by the independence among pairs of observations, contour lines of $\ln\lambda(v)$ in any of the $n$ planes defined by $j_l = 0$, $k_l = 0$, $l \neq i$, for $i = 1, 2, \ldots, n$, will be identical, and are as shown in Figure 2. It may be noted that $\ln\lambda(v)$ could be quickly evaluated by plotting the points $(j_i/m, k_i/m)$, $i = 1, 2, \ldots, n$, on Figure 2, and summing the values of the contour lines on which they fall. This reinforces our original intuitive notion of rejecting $H_0$ if too many points fall "far" from the diagonal $j_i = k_i$.

For a one sided test, say $H_1: d_i > f_i\ \forall i$, the likelihood contour line plot corresponding to Figure 2 is shown on Figure 3.
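The log likelihood ratio of equation 7 is a sum of one term per operator, so it is straightforward to evaluate numerically rather than graphically. The sketch below assumes the two sided form of equation 7 and the $0 \cdot \ln 0 = 0$ convention; the function names `xlogy` and `log_likelihood_ratio` are illustrative, not taken from the original appendices.

```python
from math import log

def xlogy(x, y):
    """Return x * ln(y), using the convention 0 * ln 0 = 0."""
    return 0.0 if x == 0 else x * log(y)

def log_likelihood_ratio(pairs, m):
    """Evaluate ln lambda(v) for observations [(j_1, k_1), ..., (j_n, k_n)]."""
    total = 0.0
    for j, k in pairs:
        pooled = (j + k) / (2 * m)
        # Maximized log likelihood under the null space (d_i = f_i).
        null_term = xlogy(j + k, pooled) + xlogy(2 * m - j - k, 1 - pooled)
        # Maximized log likelihood over the full parameter space.
        full_term = (xlogy(j, j / m) + xlogy(k, k / m)
                     + xlogy(m - j, 1 - j / m) + xlogy(m - k, 1 - k / m))
        total += null_term - full_term
    return total

# Points on the diagonal contribute nothing; points far from it contribute
# large negative values.
print(log_likelihood_ratio([(3, 3), (7, 7)], m=10))
print(log_likelihood_ratio([(10, 0)], m=10))
```

Each operator's contribution depends only on that operator's pair $(j_i, k_i)$, which is why a single contour plot serves for all $n$ planes.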

Figure 2. Contour lines of $\ln\lambda(v)$ in the plane defined by $j_i = 0$, $k_i = 0$, $i \neq 1$ (two sided test); axes $j_1/m$ and $k_1/m$, each running from 0.0 to 1.0.

It appears we are now ready to complete the procedure. We need only find $C$ such that $P[\ln\lambda(V) < C] = \alpha$ where $\alpha$ is the desired size of the test. But here is where the difficulty arises. The conventional approach to finding $P[\ln\lambda(V) < C]$ would be to write

$$P[\ln\lambda(V) < C] = P[V \in \Gamma_C] = \alpha \tag{8}$$

where $M = \{v_i\}$ is the set of possible outcome vectors and $\Gamma_C = \{v_i \in M : \ln\lambda(v_i) < C\}$; we then have

$$P[\ln\lambda(V) < C] = \sum_{v_i \in \Gamma_C} P[V = v_i]. \tag{9}$$
ve, sel, 
Figure 3. Contour lines of $\ln\lambda(v)$ in the plane defined by $j_i = 0$, $k_i = 0$, $i \neq 1$ (one sided test); axes $j_1/m$ and $k_1/m$, each running from 0.0 to 1.0.

Both $M$ and $\Gamma_C$ are readily found for all $C$, but the probability that $V = v_i$ cannot be specified, for assumption 2 states only that $J_i \overset{d}{=} b(m, d_i)$ and $K_i \overset{d}{=} b(m, f_i)$ with $d_i$ and $f_i$ unspecified. This inability to complete the relationship between the size of the procedure and the critical value $C$ renders the "conventional" approach useless.


SOME "DIRECT" ATTEMPTS TO FORMULATE TESTS 
Three comparatively straightforward attempts to circumvent the lack of specific distributions for $J_i$ and $K_i$ were made. A brief summary of each with an explanation of why it failed follows.

Figure 4. The Angular Transformation $u(j_i/m, k_i/m)$; labelled values: $u(1/2, 3/4) = \sqrt{1/32}$, $u(.6, .4) = -\sqrt{2}/10$, $u(1, .8) = -\sqrt{2}/10$.

The first attempt involved the transformation $u_i = (k_i - j_i)/(m\sqrt{2})$, which is a projection of the points $(j_i/m, k_i/m)$ onto an axis orthogonal to the line $j_i = k_i$, illustrated in Figure 4. For example, the point A: $(1/2, 3/4)$ is transformed into $u(1/2, 3/4) = \sqrt{1/32}$. Intuitively, one would hope to find a good procedure which rejects $H_0$ if too many of the $u_i$ fall close to the end points of the interval $(-\sqrt{1/2}, \sqrt{1/2})$. The details of why such a procedure is not good are burdensome, but the cause is easily shown graphically by the points B and D in Figure 4. Note that both B and D map to the same point E on the U axis, yet B represents a $\ln\lambda$ value of $-1$, while D falls on the $\ln\lambda = -4$ contour. Thus if $C$, the critical value, was $-2$, say, point B should cause acceptance of $H_0$ while point D should cause rejection, yet given only point E on the U axis, we cannot determine whether it is the image of B or D, hence this approach fails. (More complicated transformations along the same general lines were also attempted, but without success.)
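The loss of information under the projection can be seen with a few lines of arithmetic. The sketch below assumes the transformation has the form $u = (k/m - j/m)/\sqrt{2}$, which reproduces the three values labelled in Figure 4; the function name `u` follows the figure, and the choice of points is taken from those labels.

```python
from math import sqrt

def u(x, y):
    """Signed distance of the normalized point (x, y) = (j/m, k/m) from the diagonal."""
    return (y - x) / sqrt(2)

# Point A of the text.
print(u(0.5, 0.75), sqrt(1 / 32))

# Two distinct points with the same image on the U axis: the projection
# cannot tell them apart, although they sit on different likelihood contours.
print(u(0.6, 0.4), u(1.0, 0.8))
```

Any two points on a line parallel to the diagonal share one image, so a statistic built from the $u_i$ alone cannot recover their separate likelihood ratio values.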

The second attempt involved use of the sums $\sum j_i$ and $\sum k_i$ as test statistics. Again, the details are burdensome, and a graphical argument demonstrates the problem quite clearly. As an example, we shall consider two different experiments, each with $n = 2$, with results plotted on Figure 5. The points A and B are from the first experiment, the points A' and B' from the second. The sums $\sum_{i=1}^{n} j_i$ and $\sum_{i=1}^{n} k_i$ are the same for each experiment (1.1 in all cases), yet clearly, the first experiment (points A and B) should tend more toward rejection than the second (points A' and B'). In short, $\sum_{i=1}^{n} j_i$ and $\sum_{i=1}^{n} k_i$ are not sufficient statistics for $d_i$ and $f_i$, $i = 1, 2, \ldots, n$.

Figure 5. Two divergent experiments with equal data sums


Discussion of the third "direct" attempt is best deferred until the end of the next section, which establishes some necessary background and notation.


A SPECIAL CASE: n = 1





When only one operator is used in the experiment, the procedure known as Fisher's Exact Test for Percentages is applicable¹. A brief derivation follows. The test of interest is now

$$H_0: d_1 = f_1 \quad \text{against (say)} \quad H_1: d_1 > f_1.$$
Now under $H_0$ and by independence

$$P[J_1 = j_1, K_1 = k_1] = P[J_1 = j_1]\,P[K_1 = k_1] = \frac{m!\,m!\,d_1^{\,j_1 + k_1}\,(1 - d_1)^{2m - j_1 - k_1}}{j_1!\,k_1!\,(m - j_1)!\,(m - k_1)!} \tag{10}$$

But $P[J_1 = j_1, K_1 = k_1]$ is equivalent to $P[J_1 = j_1, J_1 + K_1 = j_1 + k_1]$. At this point it is convenient to introduce the random variable $L_1 = J_1 + K_1$ so that we have $P[J_1 = j_1, K_1 = k_1]$ equivalent to $P[J_1 = j_1, L_1 = l_1]$ where $l_1 = j_1 + k_1$. Equation 10 may now be written

$$P[J_1 = j_1, L_1 = l_1] = \frac{m!\,m!\,d_1^{\,l_1}\,(1 - d_1)^{2m - l_1}}{j_1!\,(l_1 - j_1)!\,(m - j_1)!\,(m + j_1 - l_1)!} \tag{11}$$

and we note that we must have $0 \le j_1 \le m$, $0 \le j_1 \le l_1 \le 2m$. Now, still under $H_0$, the random variable $L_1$ is the sum of two independent identically distributed binomial random variables and so is binomially distributed itself. Specifically, $L_1 \overset{d}{=} b(2m, d_1)$, so

$$P[L_1 = l_1] = \frac{(2m)!\,d_1^{\,l_1}\,(1 - d_1)^{2m - l_1}}{(2m - l_1)!\,l_1!} \tag{12}$$

By the definition of conditional probabilities

$$P[J_1 = j_1 \mid L_1 = l_1] = \frac{P[J_1 = j_1, L_1 = l_1]}{P[L_1 = l_1]}. \tag{13}$$

Substituting (11) and (12) into (13) yields

$$P[J_1 = j_1 \mid L_1 = l_1] = \frac{\dfrac{m!\,m!\,d_1^{\,l_1}(1 - d_1)^{2m - l_1}}{j_1!\,(l_1 - j_1)!\,(m - j_1)!\,(m + j_1 - l_1)!}}{\dfrac{(2m)!\,d_1^{\,l_1}(1 - d_1)^{2m - l_1}}{(2m - l_1)!\,l_1!}} = \frac{m!\,m!\,(2m - l_1)!\,l_1!}{(2m)!\,j_1!\,(l_1 - j_1)!\,(m - j_1)!\,(m + j_1 - l_1)!} \tag{14}$$

Rearranging appropriately yields

$$P[J_1 = j_1 \mid L_1 = l_1] = \frac{\dbinom{m}{j_1}\dbinom{m}{l_1 - j_1}}{\dbinom{2m}{l_1}} \tag{15}$$

which is the hypergeometric distribution with first parameter (population size) $2m$, second parameter (sample size) $l_1$, and third parameter (proportion) $1/2$.
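Equation 15 involves only binomial coefficients, so the conditional distribution and the size of a one sided rule can be evaluated exactly. The following is a minimal sketch; the function names `cond_pmf` and `upper_tail` and the numerical values of `m` and `l` are illustrative assumptions.

```python
from math import comb

def cond_pmf(j, l, m):
    """P[J = j | L = l] under H0 (equation 15); zero outside the support."""
    if j < max(0, l - m) or j > min(m, l):
        return 0.0
    return comb(m, j) * comb(m, l - j) / comb(2 * m, l)

def upper_tail(c, l, m):
    """P[J > c | L = l] under H0: the size of the rule 'reject when J > c'."""
    return sum(cond_pmf(j, l, m) for j in range(c + 1, min(m, l) + 1))

m, l = 10, 8  # hypothetical: 10 trials per condition, 8 indicated detections in all
for c in range(4, 8):
    print(f"reject when J > {c}: size = {upper_tail(c, l, m):.5f}")
```

Because no unknown parameter appears, the available sizes are fixed by $m$ and $l_1$ alone, which is why only certain "realizable" values of the size can be attained without randomization.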

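We now have the distribution of the number of true detections $J_1$ given $l_1$ total indicated detections, and most importantly, it is independent of $d_1$ and $f_1$. Let us now develop a likelihood ratio test of $H_0: d_1 = f_1$ against some as yet unspecified alternate hypothesis. Letting $\lambda(j_1 \mid l_1) = \max_{\omega} L(j_1 \mid l_1) / \max_{\Omega} L(j_1 \mid l_1)$, where $\omega = \{d_1 \in [0, 1];\ d_1 = f_1\}$, $\Omega = \{d_1, f_1 : 0 \le d_1 \le 1,\ 0 \le f_1 \le 1\}$, and $L(j_1 \mid l_1)$ is the conditional likelihood of $J_1$ given $l_1$, we have

$$\lambda(j_1 \mid l_1) = \frac{P[J_1 = j_1 \mid L_1 = l_1, H_0]}{P[J_1 = j_1 \mid L_1 = l_1, H_1]} \tag{16}$$

Now under $H_1$

$$P[J_1 = j_1, L_1 = l_1 \mid H_1] = \frac{m!\,m!\,d_1^{\,j_1}\,f_1^{\,l_1 - j_1}\,(1 - d_1)^{m - j_1}\,(1 - f_1)^{m + j_1 - l_1}}{j_1!\,(l_1 - j_1)!\,(m - j_1)!\,(m + j_1 - l_1)!} \tag{17}$$

and

$$P[L_1 = l_1 \mid H_1] = \sum_{\alpha = a}^{b} \frac{m!\,m!\,d_1^{\,\alpha}\,f_1^{\,l_1 - \alpha}\,(1 - d_1)^{m - \alpha}\,(1 - f_1)^{m - l_1 + \alpha}}{\alpha!\,(m - \alpha)!\,(l_1 - \alpha)!\,(m - l_1 + \alpha)!} \tag{18}$$

where $a = \max\{0,\ l_1 - m\}$ and $b = \min\{m,\ l_1\}$. Substituting (17) and (18) into (13) we have

$$P[J_1 = j_1 \mid L_1 = l_1, H_1] = \frac{\dfrac{1}{j_1!\,(l_1 - j_1)!\,(m - j_1)!\,(m - l_1 + j_1)!}\left[\dfrac{d_1(1 - f_1)}{f_1(1 - d_1)}\right]^{j_1}}{\displaystyle\sum_{\alpha = a}^{b} \dfrac{1}{\alpha!\,(m - \alpha)!\,(l_1 - \alpha)!\,(m - l_1 + \alpha)!}\left[\dfrac{d_1(1 - f_1)}{f_1(1 - d_1)}\right]^{\alpha}} \tag{19}$$

And substituting (14) and (19) into (16), we obtain

$$\lambda(j_1 \mid l_1) = \frac{m!\,m!\,(2m - l_1)!\,l_1!}{(2m)!}\left[\frac{f_1(1 - d_1)}{d_1(1 - f_1)}\right]^{j_1} \sum_{\alpha = a}^{b} \frac{1}{\alpha!\,(m - \alpha)!\,(l_1 - \alpha)!\,(m - l_1 + \alpha)!}\left[\frac{d_1(1 - f_1)}{f_1(1 - d_1)}\right]^{\alpha} \tag{20}$$

Let us now consider a specific alternate hypothesis, say $H_1: d_1 > f_1$. We then have $d_1(1 - f_1)/[f_1(1 - d_1)] > 1$ and $\lambda(j_1 \mid l_1)$ is monotonically decreasing in $j_1$. Since we would reject $H_0$ if the observed test statistic $\lambda(j_1 \mid l_1)$ is sufficiently small, say less than some critical value $C$, we can perform an equivalent test by rejecting $H_0$ if the observed value of $J_1$ is greater than $C'$, where $J_1 > C'$ is equivalent to $\lambda(j_1 \mid l_1) < C$. We note also that this test is uniformly most powerful for $H_1$ since the procedure would be the same for all $d_1 > f_1$. The size of the procedure is $P[J_1 > C' \mid l_1, H_0]$, which can be readily computed using equation 15, and the power at a specific point $(d_1, f_1)$ can be computed using equation 19.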
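The power calculation of equation 19 can be sketched in a few lines. The factorial terms of equation 19 are proportional to $\binom{m}{\alpha}\binom{m}{l_1 - \alpha}$ (the common factor $m!\,m!$ cancels between numerator and denominator), which is the form used below. The function names `alt_pmf` and `power` and the numerical values are illustrative assumptions.

```python
from math import comb

def alt_pmf(j, l, m, d, f):
    """P[J = j | L = l] for given d and f (equation 19); requires 0 < d, f < 1."""
    psi = d * (1 - f) / (f * (1 - d))  # the ratio raised to a power in equation 19
    lo, hi = max(0, l - m), min(m, l)
    if j < lo or j > hi:
        return 0.0

    def weight(a):
        return comb(m, a) * comb(m, l - a) * psi**a

    return weight(j) / sum(weight(a) for a in range(lo, hi + 1))

def power(c, l, m, d, f):
    """P[J > c | L = l] at the point (d, f) for the rule 'reject when J > c'."""
    return sum(alt_pmf(j, l, m, d, f) for j in range(c + 1, min(m, l) + 1))

m, l, c = 10, 8, 5  # hypothetical values
print("size :", power(c, l, m, 0.5, 0.5))   # d = f reduces to equation 15
print("power:", power(c, l, m, 0.7, 0.3))
```

Note that $d$ and $f$ enter only through the single ratio `psi`, so any two points $(d, f)$ with the same ratio give the same conditional power.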
For a two sided test, i.e. $H_1: d_1 \neq f_1$, a similar development shows that the optimum rejection region is characterized by the rule "reject $H_0$ when $J_1 < A$ or $J_1 > m - A$" where $A$ is such that

$$P[J_1 < A \mid l_1, H_0] + P[J_1 > m - A \mid l_1, H_0] = 2P[J_1 < A \mid l_1, H_0] = \alpha$$

(Of course, not all values of $\alpha \in [0, 1]$ are available. We assume that $\alpha$ is chosen from among the realizable values. Randomization to achieve arbitrary sizes may be used but we shall omit the details. To include such a consideration here would only add complication without adding substantive information.)

It now seems natural to extend the above result to cases where $n > 1$. This was the third "direct" attempt alluded to in the previous section. Specifically, it was hoped that the families of probability functions $P[\sum_{i=1}^{n} J_i \mid \sum_{i=1}^{n} l_i, H_0]$ and $P[\sum_{i=1}^{n} J_i \mid \sum_{i=1}^{n} l_i, H_1]$ could be obtained and used in a fashion analogous to the preceding. However, whereas the function $P[J_1 \mid l_1, H_0]$ was independent of $d_1$ and $f_1$, such is not the case for $P[\sum_{i=1}^{n} J_i \mid \sum_{i=1}^{n} l_i, H_0]$. The need to specify $d_i$ and $f_i$, $i = 1, 2, \ldots, n$, which is not permissible under our null hypothesis, forced the termination of this approach.
CHAPTER III
THE GENERAL TWO STAGE TEST


While directed at the prototype experiment, the following discussion has somewhat broader applicability than to only the specific situation discussed heretofore in this paper. For that reason, we will adopt a slightly more general notation for this chapter only. We will consider a test of the hypotheses

$$H_0: \theta_{1i} = \theta_{2i}\ \forall i \quad \text{against} \quad H_1: |\theta_{1i} - \theta_{2i}| = \Delta_i\ \forall i$$

and suppose that we have available tests $T_1, T_2, \ldots, T_n$ of the respective hypotheses

$$H_{0i}: \theta_{1i} = \theta_{2i} \quad \text{against} \quad H_{1i}: |\theta_{1i} - \theta_{2i}| = \Delta_i$$

for each $i$; $i = 1, 2, \ldots, n$. We shall denote the size and power of $T_i$ by $\alpha_i$ and $\pi_i$, respectively.

It seems reasonable that the outcomes of each of the tests $T_1$ thru $T_n$ should contain some information about $H_0$ and $H_1$. We will develop a procedure by which the results of $T_1$ thru $T_n$ (hereafter called the first stage tests) can be tested in order to make inferences about $H_0$ and $H_1$. This test on the results of $T_1$ thru $T_n$ will be called the second stage test, and the complete procedure (applying the first and second stage tests) will be called a two stage test. We begin with a special case before developing the general procedure.


IDENTICAL FIRST STAGE TESTS 
Suppose that $H_1$ is of the form $H_1: |\theta_{1i} - \theta_{2i}| = \Delta\ \forall i$. Then the respective alternate hypotheses for the first stage tests will be $H_{1i}: |\theta_{1i} - \theta_{2i}| = \Delta$ for each test $T_i$, $i = 1, 2, \ldots, n$. Now if $T_1$ thru $T_n$ are identical, that is, have the same operating characteristic curve, and we apply each at the same size ($\alpha_i = \alpha\ \forall i$), they each provide the same power ($\pi_i(\Delta) = \pi(\Delta)\ \forall i$). Now let us define the random variables $X_1, X_2, \ldots, X_n$ by

$$X_i = \begin{cases} 0 & \text{if } T_i \text{ results in acceptance of } H_{0i} \\ 1 & \text{if } T_i \text{ results in rejection of } H_{0i} \end{cases}$$

Under $H_0$ all of the $H_{0i}$ are true, so if $Y = \sum_{i=1}^{n} X_i$ then $Y \overset{d}{=} b(n, \alpha)$. Similarly if $H_1$ is true, then all of the $H_{1i}$ are true and here $Y \overset{d}{=} b(n, \pi(\Delta))$. Let us denote the (general) distribution of $Y$ as $b(n, p)$. Then if we test

$$H_0': p = \alpha \quad \text{against} \quad H_1': p = \pi(\Delta)$$

we have actually tested $H_0$ against $H_1$, since $H_0'$ is true if and only if the $H_{0i}$ are true for all $i$ and $H_0$ is also true if and only if all the $H_{0i}$ are true. In similar fashion $H_1'$ implies $H_1$ and conversely.


il 
Using the well known likelihood ratio test of $H_0': p = \alpha$ against $H_1': p = \pi(\Delta)$, the procedure will be to reject $H_0'$ if the observed value of $Y$ exceeds a critical value $y'$ where

$$P[Y > y' \mid H_0'] = \sum_{k = [y'] + 1}^{n} \binom{n}{k} \alpha^{k} (1 - \alpha)^{n - k} = \alpha' \tag{1}$$

and the symbol $[\cdot]$ denotes the greatest integer function. The test of $H_0'$ against $H_1'$ is, of course, the second stage test. Now since we can substitute $H_0$ for $H_0'$ in the left hand expression of equation (1), we see that $\alpha'$ is the size of the two stage procedure. Likewise, the power of the second stage test is given by

$$P[Y > y' \mid H_1'] = \sum_{k = [y'] + 1}^{n} \binom{n}{k} \pi(\Delta)^{k} (1 - \pi(\Delta))^{n - k} = \pi' \tag{2}$$

and here too we can replace $H_1'$ by $H_1$, so $\pi'$ is the power of the two stage procedure. It is interesting that $\alpha'$ (the size of the two stage test) may be set independently of $\alpha$, the common size of the first stage tests.
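Equations (1) and (2) are ordinary binomial tail sums, so the size and power of the two stage procedure can be tabulated directly. The sketch below uses hypothetical values for `n`, the common first stage size, and the common first stage power; the function names are illustrative.

```python
from math import comb, floor

def binom_upper_tail(n, p, y):
    """P[Y > y] for Y ~ b(n, p); the sum starts at k = [y] + 1."""
    start = floor(y) + 1
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(start, n + 1))

def two_stage_size_and_power(n, alpha, pi_delta, y_crit):
    """Return (alpha', pi') from equations (1) and (2) for critical value y_crit."""
    return binom_upper_tail(n, alpha, y_crit), binom_upper_tail(n, pi_delta, y_crit)

# Hypothetical case: 10 operators, first stage size 0.10, first stage power 0.60.
for y_crit in range(1, 6):
    size, power = two_stage_size_and_power(10, 0.10, 0.60, y_crit)
    print(f"reject when Y > {y_crit}: size = {size:.5f}, power = {power:.5f}")
```

Scanning the critical value in this way shows which overall sizes are realizable for a given first stage size, and what power each one buys.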

An examination of equation 2 suggests that the notation $\pi' = \pi'(\Delta, \alpha', \pi(\Delta, \alpha))$ should be used to emphasize the functional dependencies involved. We will write the equivalent form $\pi' = \pi'(\Delta, \alpha', \alpha)$. Thus, while $\alpha'$ may be set independently of $\alpha$, the choice of both $\alpha$ and $\alpha'$ influences $\pi'$. Generally speaking, we will be given (or be willing to specify) $\Delta$ and $\alpha'$ so our concern would be to choose an $\alpha$ that maximizes $\pi'(\Delta, \alpha', \alpha)$. This requires solution of the program

$$\text{Maximize } \pi' = \sum_{k = [y'] + 1}^{n} \binom{n}{k} [\pi(\Delta, \alpha)]^{k} [1 - \pi(\Delta, \alpha)]^{n - k}$$

$$\text{Subject to: } \sum_{k = [y'] + 1}^{n} \binom{n}{k} \alpha^{k} (1 - \alpha)^{n - k} \le \alpha', \quad 0 \le \alpha \le 1.$$

If the appropriate OC curves for the second stage test are available (that is, if sufficiently extensive binomial tables are available), a simple search procedure may be used to find the solution. Otherwise it appears that some iterative technique will be required. In any case we will not consider this problem further in this paper, except as it applies to the prototype case (Chapter 4).
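A "simple search procedure" of the kind mentioned above might look like the following grid search. This is only a sketch under stated assumptions: the first stage power curve `toy_power` is a made-up placeholder (a real application would supply the power of the actual first stage test at the chosen $\Delta$), the constraint is taken as size not exceeding $\alpha'$, and all names and values are hypothetical.

```python
from math import comb

def upper_tail(n, p, y):
    """P[Y > y] for Y ~ b(n, p) with integer y."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(y + 1, n + 1))

def best_first_stage_size(n, alpha_prime, first_stage_power, grid):
    """Grid search for the alpha that maximizes pi' subject to size <= alpha_prime."""
    best = None
    for alpha in grid:
        # Smallest integer critical value whose size does not exceed alpha'.
        y = next(y for y in range(n + 1) if upper_tail(n, alpha, y) <= alpha_prime)
        pi_prime = upper_tail(n, first_stage_power(alpha), y)
        if best is None or pi_prime > best[2]:
            best = (alpha, y, pi_prime)
    return best

def toy_power(alpha):
    # Purely illustrative power curve, NOT derived from any real first stage test.
    return alpha ** 0.25

grid = [0.01, 0.05, 0.1, 0.2, 0.3]
alpha, y_crit, pi_prime = best_first_stage_size(10, 0.05, toy_power, grid)
print("chosen alpha:", alpha, "critical value:", y_crit, "power:", pi_prime)
```

Because the attainable sizes are discrete, the maximizing $\alpha$ can jump between grid points as $\alpha'$ changes; a finer grid or an iterative refinement would be needed for a serious application.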

Two further observations are appropriate. First, the two stage test may be one sided if one sided first stage tests are available. Similarly, either or both of $H_0$ and $H_1$ can be made composite provided the corresponding $H_{0i}$ and/or $H_{1i}$ can be made composite and with a slight modification in the definition of power. The derivation in such cases requires only minor modification of the foregoing. Second, there is no assurance that the two stage test represents a uniformly most powerful test, even when the first and second stage tests are in themselves uniformly most powerful. It appears that, in general, the two stage test may not be most powerful, although its power characteristics seem to be good. We return to this question in Chapter 4.
GENERAL FIRST STAGE TESTS

We now relax the requirement that the first stage tests $T_1, T_2, \ldots, T_n$ be identical and assume only that $\alpha_i$ and $\pi_i(\Delta_i)$ are specified for each $T_i$, $i = 1, 2, \ldots, n$. We will consider testing the hypothesis

$$H_0: \theta_{1i} = \theta_{2i}\ \forall i \quad \text{against} \quad H_1: |\theta_{1i} - \theta_{2i}| = \Delta_i\ \forall i$$

for which the appropriate first stage tests are, for each $i$,

$$H_{0i}: \theta_{1i} = \theta_{2i} \quad \text{against} \quad H_{1i}: |\theta_{1i} - \theta_{2i}| = \Delta_i.$$

If some of the first stage tests are identical, we will group them together, so we will have (say) $k$ groups of identical tests (that is, identical within a group) to consider. (For example, if no two tests $T_i$, $T_j$, $i \neq j$, are identical, then $k = n$, whereas if all tests are identical $k = 1$.) Let $\beta_j$ denote the number of (identical) tests in the $j$th group, $j = 1, 2, \ldots, k$, and relabel the $\alpha_i$ and $\pi_i$ so that $\alpha_j$ and $\pi_j$ represent the common size and power, respectively, of the $\beta_j$ tests in group $j$. Now using arguments identical to those in the previous section, we have as an equivalent test of $H_0$ against $H_1$, the second stage test

$$H_0': \vec{p} = \vec{\alpha} \quad \text{against} \quad H_1': \vec{p} = \vec{\pi}.$$


Here $\vec{p}$ denotes the $k$ component vector of success probability parameters in the joint distribution of $\vec{Y} = (Y_1, Y_2, \ldots, Y_k)$ where $Y_j$ is the number of rejections (of $H_{0i}$) from the tests in group $j$. (Thus $Y_j \in \{0, 1, \ldots, \beta_j\}$.) The vectors $\vec{\alpha}$ and $\vec{\pi}$ denote $(\alpha_1, \alpha_2, \ldots, \alpha_k)$ and $(\pi_1, \pi_2, \ldots, \pi_k)$, respectively. Now if $\alpha'$ is the size of the second stage test, we have $P[\text{rejection of } H_0' \mid H_0'] = P[\text{rejection of } H_0 \mid H_0]$, so $\alpha'$ is the size of the two stage procedure as well. Similarly, the power of the second stage test, $\pi'$, is $P[\text{rejection of } H_0' \mid H_1']$, which is equivalent to $P[\text{rejection of } H_0 \mid H_1]$, so $\pi'$ is the power of the two stage procedure also.
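Let us now get the second stage procedure. By independence

$$P[\vec{Y} = \vec{y}] = \prod_{j=1}^{k} P[Y_j = y_j] \tag{3}$$

Under $H_0'$, $Y_j$ is the number of "successes" in $\beta_j$ trials at constant probability $\alpha_j$, so $Y_j \overset{d}{=} b(\beta_j, \alpha_j)$, $j = 1, 2, \ldots, k$. Similarly under $H_1'$, we have $Y_j \overset{d}{=} b(\beta_j, \pi_j)$. Then, by Equation 3, under $H_0'$

$$L_0(\vec{y}) = P[\vec{Y} = \vec{y} \mid H_0'] = \prod_{j=1}^{k} \binom{\beta_j}{y_j} \alpha_j^{\,y_j} (1 - \alpha_j)^{\beta_j - y_j} \tag{4}$$

and under $H_1'$

$$L_1(\vec{y}) = P[\vec{Y} = \vec{y} \mid H_1'] = \prod_{j=1}^{k} \binom{\beta_j}{y_j} \pi_j^{\,y_j} (1 - \pi_j)^{\beta_j - y_j} \tag{5}$$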
Thus the likelihood ratio $\lambda(\vec{y}) = L_0(\vec{y}) / L_1(\vec{y})$ is given by

$$\lambda(\vec{y}) = C \prod_{j=1}^{k} \left[ \frac{\alpha_j (1 - \pi_j)}{\pi_j (1 - \alpha_j)} \right]^{y_j} \tag{6}$$

where the constant $C$ is $\prod_{j=1}^{k} \left[ \dfrac{1 - \alpha_j}{1 - \pi_j} \right]^{\beta_j}$. In general we will have $\pi_j > \alpha_j$, hence $\dfrac{\alpha_j(1 - \pi_j)}{\pi_j(1 - \alpha_j)} < 1\ \forall j$, so $\lambda(\vec{y})$ is monotonically decreasing as a function of any component $y_j$ of $\vec{y}$. Characterization of the optimum rejection region is best accomplished by considering

$$\ln\lambda(\vec{y}) = \sum_{j=1}^{k} y_j \left[ \ln\alpha_j - \ln(1 - \alpha_j) + \ln(1 - \pi_j) - \ln\pi_j \right] + \ln C. \tag{7}$$

We will reject $H_0'$ when $\ln\lambda(\vec{y})$ is less than some critical value. Since the quantity in the brackets $[\cdot]$ in equation 7 is negative for all $j$, the general tendency will be to reject $H_0$ when several or all of the $y_j$'s are "large" (near $\beta_j$), an intuitively appealing result.

Specifically, we may label all possible outcome vectors $\vec{y}$ so that $\lambda(\vec{y}_i) \le \lambda(\vec{y}_{i+1})$, $i = 1, 2, \ldots, v - 1$, where $v$ is the number of different $\vec{y}$'s possible. ($v = \prod_{j=1}^{k} (\beta_j + 1)$.) Then the critical region consists of the first $r$ vectors $\vec{y}_1, \vec{y}_2, \ldots, \vec{y}_r$ where $r$ is such that $\sum_{i=1}^{r} P[\vec{Y} = \vec{y}_i \mid H_0'] = \alpha'$. (We again assume $\alpha'$ is chosen so that it is a realizable value.) The power $\pi'$ is then $\sum_{i=1}^{r} P[\vec{Y} = \vec{y}_i \mid H_1']$.
j=] | il 
There are several schemes for reducing the work required to produce the critical region (or equivalently, the acceptance region). The most straightforward is to find $\vec{y}_1$ and compute $P[\vec{Y} = \vec{y}_1 \mid H_0']$. If $P[\vec{Y} = \vec{y}_1 \mid H_0'] < \alpha'$, find $\vec{y}_2$ and compute $P[\vec{Y} = \vec{y}_2 \mid H_0']$. If $\sum_{i=1}^{2} P[\vec{Y} = \vec{y}_i \mid H_0'] < \alpha'$, find $\vec{y}_3$, etc., until $\sum_{i=1}^{s} P[\vec{Y} = \vec{y}_i \mid H_0'] = \alpha'$, at which point $s = r$ and the critical region is defined. An identical procedure may be used to define the critical region for a test of specified power by using the distribution of $\vec{Y}$ under $H_1'$ rather than $H_0'$. An even faster method (in most practical cases) is to start by finding $\vec{y}_v$ and working "down" to the required size (or power) in a similar manner.
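The ordering-and-accumulating scheme just described can be sketched by brute force for small problems. The code below is an illustrative assumption, not the program of Appendix B: it enumerates every outcome vector, sorts by the likelihood ratio, and stops before the accumulated null probability would exceed $\alpha'$ (so the attained size is at most $\alpha'$ rather than exactly equal to it). It requires every $\alpha_j$ and $\pi_j$ to lie strictly between 0 and 1, and ties in the likelihood ratio are broken arbitrarily.

```python
from itertools import product
from math import comb, prod

def outcome_prob(y, betas, probs):
    """P[Y = y] for independent Y_j ~ b(beta_j, p_j)."""
    return prod(comb(b, yj) * p**yj * (1 - p)**(b - yj)
                for yj, b, p in zip(y, betas, probs))

def critical_region(betas, alphas, pis, alpha_prime):
    """Return (region, size, power) for the second stage test."""
    outcomes = list(product(*(range(b + 1) for b in betas)))
    # Order outcomes by likelihood ratio L0/L1, smallest (most extreme) first.
    outcomes.sort(key=lambda y: outcome_prob(y, betas, alphas)
                  / outcome_prob(y, betas, pis))
    region, size = [], 0.0
    for y in outcomes:
        p0 = outcome_prob(y, betas, alphas)
        if size + p0 > alpha_prime:
            break
        region.append(y)
        size += p0
    power = sum(outcome_prob(y, betas, pis) for y in region)
    return region, size, power

# Hypothetical case: two groups of identical first stage tests.
region, size, power = critical_region((2, 2), (0.1, 0.1), (0.5, 0.7), 0.05)
print("critical region:", region)
print("size:", size, "power:", power)
```

Full enumeration visits $\prod_j (\beta_j + 1)$ outcomes, which is manageable for small $k$ but grows quickly, hence the interest in the faster schemes mentioned in the text.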

A computer code (Appendix B) has been written which can complete a problem with $n = 20$, $k = 5$ (5 different first stage tests) in less than two minutes on a medium sized third generation computer. This program will give the acceptance region and power of a test of some given size or, if desired, will print out all or a portion of the cumulative distribution function of $\vec{Y}$ under both $H_0'$ and $H_1'$.

In this test, as with the identical first stage tests, there is some optimal choice of the values for $\alpha_j$, $j = 1, 2, \ldots, k$, which will maximize $\pi'$ for a given size $\alpha'$. The appropriate program is

$$\text{Maximize } \pi' = \sum_{i=1}^{r} P[\vec{Y} = \vec{y}_i \mid \vec{\pi}(\vec{\alpha})]$$

$$\text{Subject to: } \sum_{i=1}^{r} P[\vec{Y} = \vec{y}_i \mid \vec{\alpha}] \le \alpha', \quad 0 \le \alpha_j \le 1\ \forall j.$$


This is a challenging program indeed and we shall not 
discuss the problem here, except to state that in the proto- 
type situation it was possible to obtain powers approaching 
those available under the assumption of equal performance, 
using an ad hoc procedure (Chapter 4). 

As a final comment we note that one sided tests and 
tests with composite hypotheses may be made if appropriate 
first stage tests are available. The derivation of the 
specific two stage test for such cases can be completed in a
manner analogous to the foregoing.

CHAPTER IV


ADAPTING THE TWO STAGE TEST


TO THE PROTOTYPE SITUATION


We now consider the application of the two stage test
to our prototype testing situation. We wish to test

$$H_0\colon d_i = f_i \ \ \forall i, \quad \text{against (say)} \quad H_1\colon d_i - f_i = \Delta_i \ \ \forall i, \quad i = 1, 2, \ldots, n.$$

The use of the two stage procedure is facilitated by
specification of the $\Delta_i$'s in such a manner that Fisher's
Exact Test can be used as a first stage test. Thus we will
specify $\Delta_i$ so that $\pi_i(\alpha_i)$ is a constant for all $d_i$ and $f_i$
which satisfy $d_i - f_i = \Delta_i$.
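Let us consider equation 19 of Chapter II, which
may be used to compute the power of Fisher's Exact Test.
We consider now, however, the application of Fisher's test
to the $i$th individual. For the $i$th individual, equation 19
becomes

$$P[J_i = j_i \mid L_i = l_i;\, H_{1i}] = \frac{\binom{m}{j_i}\binom{m}{l_i - j_i}\left[\dfrac{d_i(1-f_i)}{f_i(1-d_i)}\right]^{j_i}}{\sum_{\nu=a}^{b}\binom{m}{\nu}\binom{m}{l_i-\nu}\left[\dfrac{d_i(1-f_i)}{f_i(1-d_i)}\right]^{\nu}} \qquad (1)$$

where $a = \max\{0,\, l_i - m\}$ and $b = \min\{m,\, l_i\}$.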


FIGURE 6

Some Sample Specifications of $H_{1i}$


Let us specify

$$H_{1i}\colon d_i - f_i = \frac{1 - f_i}{[f_i(D-1)]^{-1} + 1}$$

where $D \in (0, \infty)$.
A graphical interpretation of this specification is
shown in Figure 6 for several values of $D$.
Two interesting features emerge from this specification.
First, as seen in Figure 6, this specification for $H_{1i}$ tends
to satisfy one's intuitive feelings about a reasonable form
for $H_{1i}$. That is, we feel that the difference $d_i - f_i$, about
which we are concerned, can be relatively large for $f_i$ at
midrange, but should shrink as $f_i$ approaches its extreme
values. Second, specifying $d_i - f_i = (1-f_i)/\{[f_i(D-1)]^{-1} + 1\}$ for
$H_{1i}$ yields $\dfrac{d_i(1-f_i)}{f_i(1-d_i)} = D$, and equation 1 may be written

$$P[J_i = j_i \mid L_i = l_i;\, H_{1i}] = \frac{D^{j_i} \big/ \left[j_i!\,(l_i - j_i)!\,(m - j_i)!\,(m + j_i - l_i)!\right]}{\sum_{\nu=a}^{b} D^{\nu} \big/ \left[\nu!\,(l_i - \nu)!\,(m - \nu)!\,(m + \nu - l_i)!\right]} \qquad (2)$$
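As a quick numerical check of the constant-ratio property, the following Python sketch evaluates the alternate-hypothesis curve and the ratio $d(1-f)/[f(1-d)]$. The function names are illustrative only, and the sketch assumes $0 < f < 1$ and $D \neq 1$ so that the expression is defined.

```python
def detection_prob(f, D):
    """Detection probability d on the curve d - f = (1 - f) / ([f (D - 1)]^-1 + 1)."""
    delta = (1.0 - f) / (1.0 / (f * (D - 1.0)) + 1.0)
    return f + delta


def odds_ratio(d, f):
    """The ratio d(1 - f) / [f(1 - d)], which the specification holds equal to D."""
    return d * (1.0 - f) / (f * (1.0 - d))


# With D = 3 the curve passes through (d, f) = (.75, .5) and (.9, .75),
# the two points used later in the comments on power.
for f in (0.1, 0.5, 0.75, 0.9):
    d = detection_prob(f, 3.0)
    print(f, round(d, 4), round(odds_ratio(d, f), 6))
```

The difference $d - f$ is largest for $f$ near midrange and shrinks toward both extremes, which is the behaviour described for Figure 6.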
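The power of the (first stage) test is then

$$\pi_i = \frac{\sum_{j=j_i'+1}^{b} D^{j} \big/ \left[j!\,(l_i - j)!\,(m - j)!\,(m + j - l_i)!\right]}{\sum_{\nu=a}^{b} D^{\nu} \big/ \left[\nu!\,(l_i - \nu)!\,(m - \nu)!\,(m + \nu - l_i)!\right]} \qquad (3)$$

where $j_i'$ is such that

$$P[J_i > j_i' \mid L_i = l_i;\, H_{0i}] = \sum_{j=j_i'+1}^{b} \frac{\binom{m}{j}\binom{m}{l_i - j}}{\binom{2m}{l_i}} = \alpha_i.$$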
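The available (size, power) pairs for one value of the total response can be tabulated directly from these expressions. The sketch below is a hypothetical Python stand-in for that calculation, not the FORTRAN code of Appendix A. It uses the binomial coefficient form of the conditional distribution, which differs from the factorial form only by a constant factor that cancels, and it treats the one sided rule "reject when $J_i$ exceeds the critical value".

```python
from math import comb


def conditional_pmf(m, l, D):
    """Distribution of J given L = l when the odds ratio is D (D = 1 is the null)."""
    a, b = max(0, l - m), min(m, l)
    weights = {j: comb(m, j) * comb(m, l - j) * D**j for j in range(a, b + 1)}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


def size_power_pairs(m, l, D):
    """List (critical value c, size, power) for the rule: reject when J > c."""
    null = conditional_pmf(m, l, 1.0)
    alt = conditional_pmf(m, l, D)
    pairs = []
    for c in sorted(null):
        size = sum(p for j, p in null.items() if j > c)
        power = sum(p for j, p in alt.items() if j > c)
        pairs.append((c, size, power))
    return pairs


# Example call for m = 8 attempts, total response l = 10, D = 3.
for c, size, power in size_power_pairs(8, 10, 3.0):
    print(c, round(size, 4), round(power, 4))
```

Only the non-randomized sizes appear in the list, which is why the text speaks of "available" values of the first stage size.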


Should we desire a two sided test we can specify 


i-f 


di 
H a oe. 2 | oe ee) 
ee et [f, (D-1)]-? +1 
ae (th), 
Solving for $\dfrac{d_i(1-f_i)}{f_i(1-d_i)}$ in this specification of $H_{1i}$ yields
the solutions $D$ and $1/D$. It may be readily verified that
equation 3 yields the same value of $\pi_i$ when $1/D$ is substituted
for $D$. Hence, the power of this (two sided) test is still
given by equation 3 provided the value of $j_i'$ is set by


$$\alpha_i = P[J_i > j_i' \mid L_i = l_i;\, H_{0i}] + P[J_i < l_i - j_i' \mid L_i = l_i;\, H_{0i}] = 2\sum_{j=j_i'+1}^{b} \frac{\binom{m}{j}\binom{m}{l_i - j}}{\binom{2m}{l_i}}. \qquad (5)$$
a oe 1, 


THE TWO STAGE TEST FOR THE PROTOTYPE CASE

We now discuss our test of the hypotheses

$$H_0\colon d_i = f_i \ \ \forall i \quad \text{against} \quad H_1\colon d_i - f_i = \frac{1 - f_i}{[f_i(D-1)]^{-1} + 1} \ \ \forall i.$$

The first stage tests $T_1, T_2, \ldots, T_n$ will be, for each
$i$, $i = 1, 2, \ldots, n$, Fisher's Exact Test of the hypotheses

$$H_{0i}\colon d_i = f_i \quad \text{against} \quad H_{1i}\colon d_i - f_i = \frac{1 - f_i}{[f_i(D-1)]^{-1} + 1}.$$


_— 
Given the observed values $\underline{l} = (l_1, l_2, \ldots, l_n)$ of
$\underline{L} = (L_1, L_2, \ldots, L_n)$ (total indicated detections by each
subject) we form (in general) $m$ groups of results by the
rule "group $\gamma$ contains all results where $l_i = \gamma$ or $l_i = 2m - \gamma$;
$\gamma = 1, 2, \ldots, m$." Because Fisher's test is the same for $l_i = \gamma$
or $l_i = 2m - \gamma$, the tests to be applied within each group
are indeed identical. Continuing in the notation of Chapter
3, the number of results (or equivalently, the number of
first stage tests) in group $\gamma$ will be denoted $\phi_\gamma$, for
$\gamma = 1, 2, \ldots, m$. Of course some of the $\phi$'s may be zero, and
while the derivation of the two stage test did not take
into account that possibility, it is easily shown that any
number of the $\phi_\gamma$'s may be zero without affecting the pro-
cedure or its power. It is convenient here to set $k$, the
number of different tests, equal to $m$ and allow some of the
$\phi$'s to be zero, as necessary.
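The grouping rule is easy to mechanize. The following Python sketch (function names are illustrative) assigns each total response to its group and counts the first stage tests per group. Rejecting totals of 0 or 2m is an assumption of the sketch, since the rule as stated only defines groups 1 through m.

```python
from collections import Counter


def group_of(l, m):
    """Group index for total response l: the group gamma with l = gamma or l = 2m - gamma."""
    if not 0 < l < 2 * m:
        raise ValueError("total response must lie strictly between 0 and 2m")
    return min(l, 2 * m - l)


def group_counts(responses, m):
    """Number of first stage tests in each group, listed for groups 1..m."""
    counts = Counter(group_of(l, m) for l in responses)
    return [counts.get(g, 0) for g in range(1, m + 1)]


# Illustrative totals for m = 8; these are not the case study data.
print(group_counts([10, 14, 3, 8, 7], 8))
```

Groups that receive no results simply get a count of zero, matching the convention of setting k equal to m.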

We may use the computer code of Appendix A to compute
the available values of $\alpha_\gamma$ and $\pi_\gamma$, that is, the possible
non-randomized sizes and corresponding powers of the (first
stage) tests in each group. We then choose, by a method
discussed below, a size $\alpha_\gamma$ for the tests in each group and
apply Fisher's test at that size to all results in that
group, for each group $\gamma$, $\gamma = 1, 2, \ldots, m$. The computer code
of Appendix B may then be used to obtain a characterization
of an acceptance region for the second stage test at the
desired size $\alpha'$. Finally we form the test statistic
$\underline{Y} = (Y_1, Y_2, \ldots, Y_m)$ where $Y_\gamma$ is the number of rejections of
$H_{0i}$ from among the $\phi_\gamma$ first stage tests applied to the
results in group $\gamma$. If $\underline{Y}$ is in the acceptance region, we
accept $H_0$; otherwise we reject $H_0$ and the procedure is com-
plete.
It is interesting that the power of the two stage test
just described is a random variable. This is true because
in the equation


(2) (6) 


the values of $\phi_\gamma$ and $\pi_\gamma(\alpha_\gamma)$ depend upon the realization of
$\underline{L}$ (or more specifically, the values of $\phi_\gamma$, $\gamma = 1, 2, \ldots, m$).
We note further that the concept of expected power cannot
be applied here, since our assumptions preclude specifica-
tion of a distribution for $\underline{L}$. (Indeed, suppose we did
assume a distribution for $\underline{L}$. Then we could (in theory)
derive the distributions of $J_i$ and $L_i - J_i$, which would be
logically equivalent to fixing values of $d_i$ and $f_i$, violat-
ing our assumptions.)

However, even when the power of the two stage test
obtains its "poorer" values, it still has good power for
practical purposes, and in fact that power approaches the
power available from tests made under the assumption of
equal performance by all individuals. We next consider the
choice of the best set of sizes for the first stage tests.


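CHOOSING FIRST STAGE SIZES
As noted in Chapter 3, the "optimal" choice of first
stage sizes $\alpha_\gamma$, $\gamma = 1, 2, \ldots, m$ would be given by a solution
to the program

$$\text{Maximize } \pi' = \sum_{i=1}^{r} P[\underline{Y} = \underline{y}_i \mid \underline{\pi}(\underline{\alpha})]$$

$$\text{Subject to: } \sum_{i=1}^{r} P[\underline{Y} = \underline{y}_i \mid \underline{\alpha}] = \alpha' \text{ and } \lambda(\underline{y}_i) \le \lambda(\underline{y}_{i+1}),\ i = 1, 2, \ldots, v-1$$

where $v = \prod_{i=1}^{m} (\phi_i + 1)$, $\underline{\alpha} = (\alpha_1, \alpha_2, \ldots, \alpha_m)$ and
$\underline{\pi}(\underline{\alpha}) = (\pi_1(\alpha_1), \pi_2(\alpha_2), \ldots, \pi_m(\alpha_m))$.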


We have been unable to solve this program in general,
so we offer only a rule of thumb for choosing the sizes
of the first stage tests. This practical approach has been
successful in the cases examined thus far. The rule in-
volves use of non-randomized values for $\alpha_\gamma$ only, and only
assures a "near optimal" vector $\underline{\alpha}_b = (\alpha_1, \alpha_2, \ldots, \alpha_m)$
where by "near optimal" we mean that only minor improvement
in the value of $\pi'$ (say in the second or third decimal
place) would accrue if the true optimal vector $\underline{\alpha}$ were
found.

The first step involves finding an upper bound for
$\pi'$. Consider the result $\phi_\gamma = 0$, $\gamma = 1, 2, \ldots, m-1$, $\phi_m = n$,
which represents the most "informative" result obtainable.
Since such a result reduces to the identical first stage
tests situation, we would be testing the second parameter
of a binomial distribution, that is

$$H_0'\colon p = \alpha_m \quad \text{against} \quad H_1'\colon p = \pi_m.$$

The best choice of $(\alpha_m, \pi_m)$ can be obtained for that case
by enumerating the powers $\pi'$ obtained from each of the
available $(\alpha_m, \pi_m)$ pairs, using tables of the binomial
distribution. Let $\pi_u'$ be the highest such power obtained
for the desired size $\alpha'$.
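The enumeration that yields the upper bound can be sketched as follows. This is a hypothetical Python illustration rather than the author's procedure with binomial tables: it assumes each power exceeds its size, so that the best test rejects for large counts, and it uses the non-randomized critical value whose size does not exceed the requested one.

```python
from math import comb


def upper_tail(n, p, c):
    """P[Y >= c] for Y binomial with n trials and success probability p."""
    return sum(comb(n, y) * p**y * (1 - p) ** (n - y) for y in range(c, n + 1))


def best_power(n, pairs, size):
    """Best second stage power over the available (alpha, pi) first stage pairs.

    Returns (power, (alpha, pi, critical count)), where the second stage
    test rejects when at least `critical count` first stage tests reject.
    """
    best = (0.0, None)
    for alpha, pi in pairs:
        for c in range(n + 2):
            if upper_tail(n, alpha, c) <= size + 1e-12:
                power = upper_tail(n, pi, c)
                if power > best[0]:
                    best = (power, (alpha, pi, c))
                break  # smallest admissible c gives the largest power for this pair
    return best


# Illustrative pairs only; the actual pairs would come from the first stage tabulation.
print(best_power(21, [(0.30, 0.74), (0.50, 0.88)], 0.05))
```

Because only attainable sizes are used, the bound found this way is for non-randomized second stage tests.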

Now, using the computer code of Appendix B, we may
compute $\pi_0'$ for the vector $\underline{\alpha}_0 = (\alpha_{01}, \alpha_{02}, \ldots, \alpha_{0m})$ where
$\alpha_{0\gamma}$, $\gamma = 1, 2, \ldots, m$ denotes the largest non-zero, non-
randomized value of $\alpha_\gamma$ less than or equal to 0.5. For a
two stage test of size .05, if $\pi_0'$ is within (say) .02 of
$\pi_u'$, it seems reasonable to accept $\underline{\alpha}_0$ as "near optimal"
since any other choice of $\underline{\alpha}$ will improve $\pi'$ by .02 at most.
If $\pi_0'$ is not within .02 of $\pi_u'$, there is cause to believe
it is not "near optimal" and a new vector $\underline{\alpha}_b$ should be
tried. No firm suggestion for choosing $\underline{\alpha}_b$ is given, but
the following observations, made from the experience of the
case study to follow, are given as guidance.

1. Components of $\underline{\alpha}$ for which $\phi_\gamma$ is large seem
to be most sensitive (that is, a change of one "step"
in available $\alpha_\gamma$ values causes a relatively large
change in $\pi'$). Components of $\underline{\alpha}$ for which $\phi_\gamma$ is small
(say 1) seem quite insensitive. Thus in seeking the
optimal $\underline{\alpha}$ it would seem advisable to change the $\alpha_\gamma$'s
for which $\phi_\gamma$ is large first.

2. It appears that each component may be "near-
optimized" separately. That is, if $\phi_q$ is the largest
$\phi_\gamma$ value, then one may find the best value of $\alpha_q$ for
arbitrary values of the other $\alpha_\gamma$'s. Having found a
"near optimal" value for $\alpha_q$, say $\alpha_{q0}$, then one can
find a "near optimal" value for $\alpha_p$, say $\alpha_{p0}$, in the
vector $(\alpha_1, \ldots, \alpha_{p0}, \ldots, \alpha_{q0}, \ldots, \alpha_m)$ where $\alpha_\gamma$ is arbitrary
for all $\gamma$ except $p$ and $q$ and $\phi_p$ is the next largest
$\phi$. This procedure may be continued until all compo-
nents of $\underline{\alpha}$ have been "near-optimized". Due to inter-
actions between components of $\underline{\alpha}$ the resulting vector
may not be optimal but the case study experience
provides strong evidence that it will be near optimal.
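The two observations above amount to a one-pass coordinate search, taking groups in decreasing order of the number of tests they contain. A minimal Python sketch follows. The names `coordinate_search` and `power_of` are hypothetical, and `power_of` stands for whatever routine returns the second stage power of a trial size vector (in the thesis, a run of the Appendix B code).

```python
def coordinate_search(phi, options, power_of, start):
    """One pass of component-wise search for a near optimal size vector.

    phi      : number of first stage tests in each group
    options  : available non-randomized sizes for each group
    power_of : function mapping a size vector (list) to second stage power
    start    : initial size vector
    """
    current = list(start)
    # Visit groups with the most tests first, since they matter most.
    order = sorted(range(len(phi)), key=lambda g: -phi[g])
    for g in order:
        if phi[g] == 0:
            continue  # an empty group cannot affect the power
        best_val, best_choice = None, current[g]
        for choice in options[g]:
            trial = current[:g] + [choice] + current[g + 1:]
            val = power_of(trial)
            if best_val is None or val > best_val:
                best_val, best_choice = val, choice
        current[g] = best_choice
    return current
```

As the text cautions, interactions between components mean a single pass need not reach the true optimum; the sketch makes no claim beyond the "near optimal" heuristic.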


While the foregoing seems to suggest a lengthy procedure,
in a very limited set of test cases considered, the initial
vector $\underline{\alpha}_0$ was "near optimal" and no further work was
necessary. Thus it may be that the choice of $\underline{\alpha}$ is not a
problem in practice.


A CASE STUDY

Our prototype experiment was based on an experiment
conducted at the US Naval Postgraduate School in the spring
of 1968. There were n=21 subjects, each of whom made 8
attempts to detect actual targets and 8 attempts to detect
non-existent targets (m=8). The value D=3 was used in the
specification of $H_1$, making the hypotheses under test

$$H_0\colon d_i = f_i \ \ \forall i \quad \text{against} \quad H_1\colon d_i - f_i = \frac{1 - f_i}{[2f_i]^{-1} + 1} \ \ \forall i.$$

The experimental data are given in Table 2.


TABLE 2


Results of a Prototype Experiment


Subject   Indicated    True          Subject   Indicated    True
Number    Detections   Detections    Number    Detections   Detections
(i)       (l_i)        (j_i)         (i)       (l_i)        (j_i)
ib BZ 5 a Z iL 
é ? 5 12 3 2 
3 10 2 13 10 5 
uf 10 4 14 11 4 
> ( 5 ie ( 3 
6 9 5 16 14 8 
if lid fi ie tz 6 
8 ‘ii 5 18 10 6 
7 14 8 J ff 3 
10 i 2 20 ti 3 

21 9 4

A near optimal vector $\underline{\alpha} = (\alpha_1, \alpha_2, \ldots, \alpha_8)$ was arrived
at using the procedure discussed previously. It may be
informative to reconstruct that procedure at this point.
The (hypothetical) case $\phi_\gamma = 0$, $\gamma = 1, 2, \ldots, 7$, $\phi_8 = 21$ was used
to get an upper bound $\pi_u'$ on the power for size $\alpha' = .05$.
The various $(\alpha_\gamma, \pi_\gamma)$ pairs available for each group $\gamma$,
$\gamma = 1, 2, \ldots, 8$ were obtained using the computer code of
Appendix A, and are reproduced in Table 3, along with a
listing of the total responses ($l_i$'s) which fell into each
given group $\gamma$ according to the rule given earlier (group $\gamma$
contains all results $l_i = \gamma$ or $l_i = 2m - \gamma$) and the critical
value for conducting the (first stage) test at the listed
size $\alpha_\gamma$ when the total response is $l_i$. $\pi_u'$ was found to

be .994 and the first trial vector a, was (2 505s23gR50n "25. 
505, 308s 504031). Use of a. yielded n'= .98401 which is 
near optimal. The first stage tests were then applied at
the sizes in $\underline{\alpha}_0$ with results as shown in Table 4. The
vector $\underline{Y} = (Y_1, Y_2, \ldots, Y_8) = (0, 2, 1, 0, 0, 1, 3, 0)$, which may be
read directly as the right hand column of Table 4, was
found to be in the acceptance region and $H_0$ was accepted
at the .05 level of significance. Note that the computer
output from the trial computation for $\underline{\alpha}_0$ includes a complete
description of the acceptance region, so further compu-
tation was not necessary.


ADDITIONAL RESULTS
Although not mentioned in the section on choosing a
near optimal first stage size vector $\underline{\alpha}$, there is evidence


TABLE 3


Available $(\alpha_\gamma, \pi_\gamma)$ Pairs for the Prototype Experiment


Group Number ($\gamma$)    $(\alpha_\gamma, \pi_\gamma)$    Total Response ($l_i$)    Critical Value $c$ (Reject for $J_i > c$)


1 
3 
3 


(2 0neCs) ; 
| CAOmes as 3 


(320.069 


-04,.22 
CEO asic: 


(aly, . 50) 
CAO ee) 





(.30,.74) 
(300%. 33) 
(7003, .05) 
(.50,.88) 
Pees 5)) 
(mOz so) 











PF fF FP FP BF Pe 
DOO WO OFRMNPUOPNO DIED & 
yoo™N On WN ERI NOAH & EWP OW EW FE WWN NW 


ales eye 
re) 8 
(.07,.36) 8 
8 
8 


O AJLO NINO NNO NN) 





(OCS a107) 
(.0001,.003)

TABLE 4


Results of First Stage Tests


Group Number   Number of First Stage             Size Used to                      Number of
($\gamma$)     Tests in Group ($\phi_\gamma$)    Perform Tests ($\alpha_\gamma$)   Rejections ($Y_\gamma$)
iL 1 o 50 0 
é 3 n25 2 
3 i o 50 1 
Ly 3 226 0 
5 1 0 0 
6 4 o 30 1 
7 8 0 50 3 
8 0 ap 0 


that a rather large number of near optimal vectors ($\underline{\alpha}$)
exist. In the case study just presented, an extended in-
vestigation revealed that there are at least 24 near optimal
vectors for a test of size .05. (Indeed there may be more
than 24, since the investigation did not enumerate all
possible vectors $\underline{\alpha}$.) Table 5 presents a resume of the near
optimal vectors found. It is interesting that at least 20
first stage size vectors yield powers which differ by less
than .004. Additionally, the investigation disclosed at
least 26 more vectors which produce powers greater than .974
and may be considered "near-optimal".


TABLE 5


A List of Near Optimal First Stage Size Vectors


a, = 650; Oy = 0f3s and Ae= el in all vectors. 





Note; 

the subvector (3 9A), 90 

Subvector iil 

(2507628, . 140 oy 6 5 5 
(.. 50mm 2s 5. 50g 0, eee me Joo 
(.50 53265. 5052 30S gee 1 
(,505.¢°8,.145..30, em oo 91 
(, 103828, . 50,5 30Reeo) . 9eee2 
(.109928,.14, 360, sae) . . 9860 
(.10,.28,.50,. 350, <One. Joes 
(LOGN28 7 140 FOr 928s 
(.90,522852505. 305510)" .9eze4 
(250,228,766, 0,.16) <9eaee 
(.90,.28, say, 290), a0). ORO 
(290; 2850505 0805s 50) eee 205 
(es SOpe Come C6, 30 ywe50) «96m 0 
(90, . 28m, 0, . 50) mee ae 
(3505 2G OL Pee 1 Caer ome? 
(750,.28,.01, .S08. 50) aeenst 


5 


.97829
.97743
.97741
.97738


If we consider an experiment somewhat smaller than
in the preceding case study, it is possible to list all
possible first stage size vectors (easily) and compute the
resulting powers $\pi'$. Let us consider the case n=13, m=8,
and suppose the total responses $l_i$ were such that
CAS Osa wlyy 25354, 8, G.=1, B.=4, and po In this 
instance the computer code of Appendix B produced values of
$\pi'$ for all possible first stage size vectors in less than
1 minute on the IBM 360 computer. The optimal vector $\underline{\alpha}$
yielded a power of .949. For small experiments, or when
fewer than 5 of the $\phi$'s are non zero, this complete enumer-
ation technique is recommended. For other cases, the number
of possible vectors ($\underline{\alpha}$'s) is generally rather large and
computation time grows rapidly. In the case study presented,
a complete enumeration would have required more than 4 hours
of computer time, which is prohibitive for most users,
especially in view of the small improvement in $\pi'$ which may
result.


COMMENTS ON POWER

In formulating the two stage test, we have weakened
some of the assumptions required for the earlier techniques
considered in Chapter 1. We would therefore expect some
reduction in power compared to the tests which take advantage
of the stronger assumptions. To clarify this point, a
numerical example is now given.

Suppose the experiment discussed in the case study had been
analyzed under the additional assumption that all individuals
perform equally. Under that assumption, the total number of
os | 


true detections 4 Jj =97 and the total number of false 
Za — 
alarms 2X Jb) eeROIE. would have been considered as obser- 
i 
vations on the random variables J' and K' where g' 2 b(168,4) 
ana K' £ (168,f). Typically, the test of Ho: d=f against 
$H_1\colon d - f = \dfrac{1 - f}{[2f]^{-1} + 1}$ would be made. Using one of the usual
asymptotic techniques with size .05 results in a power of
.999 at the point (d=.75, f=.5), a typical point on the
curve used in the alternate hypothesis for the case study.
Comparing .999 with .984, the optimal power for the two
stage test in the case study, we see that relaxation of the
assumptions $d_i = d\ \forall i$ and $f_i = f\ \forall i$ has "cost" .015 in
power. While it is for the individual experimenter to
decide on the merits of this "trade off", it seems that in
general, the loss of power is slight in light of the much
broader applicability of the two stage test. Additionally,
at the point (d=.9, f=.75), which is also on the curve
$d - f = \dfrac{1 - f}{[2f]^{-1} + 1}$, the asymptotic technique yields power of .980
whereas the two stage test continues to have power of .984.
At this more extreme point, the two stage test is better
than the asymptotic test under the stronger assumptions.

It may also be important to set a lower bound on the
power of a two stage test for a given experiment before it
is run. The poorest power would emerge if all results fell
in group 1, that is, if $L_i$ was 1 or $2m-1$ for all $i$. How-
ever, the result $\phi_1 = n$ is so unlikely that it is felt the
hypothetical result $\phi_3 = n$ or even $\phi_2 = n$ can be used to give
a practical lower bound for power which is still rather
conservative (perhaps even pessimistic).

In the case study, if $\phi_2$ was 21, the test $H_0'\colon p = \alpha_2$
against $H_1'\colon p = \pi_2$ would be used to determine the lower
bound on $\pi'$, where $(\alpha_2, \pi_2)$ is the size - power pair which
gives the best power at size .05. Completing this procedure
for the case example yields .91 as a practical lower bound on
the power of the two stage test. Of course, the actual
results were much more "informative" than $\phi_2 = n$, and the
actual power achieved was .984.

CHAPTER V
A RAPID APPROXIMATE PROCEDURE


If we strengthen assumption i (p. 11) by adding the
statement "m is sufficiently large to assure that the arcsine
transformation provides reasonably 'normal' random
variates when applied to the experimental data", (leaving
all other assumptions as stated), we may then use an approx-
imate procedure to test the hypothesis

$$H_0\colon d_i = f_i \ \ \forall i \quad \text{against} \quad H_1\colon d_i \ne f_i \ \ \forall i.$$

Only a brief derivation will be given and we make no
comments about the properties of the procedure except to
remark that it is easy and rapid in application as compared
to the two stage test, and in all test cases thus far
examined, has provided the same results (in terms of re-
jection of $H_0$) as the two stage test.


DERIVATION OF THE PROCEDURE
We shall use the terminology of Chapter 2. Under the
strengthened version of assumption i noted above, we have
(approximately)

$$\arcsin\sqrt{J_i/m} \sim N\!\left(\arcsin\sqrt{d_i},\ 1/4m\right) \ \ \forall i, \text{ and}$$

$$\arcsin\sqrt{K_i/m} \sim N\!\left(\arcsin\sqrt{f_i},\ 1/4m\right) \ \ \forall i.$$

From the well known properties of the normal distribution

$$\sqrt{2m}\,\arcsin\sqrt{J_i/m} \sim N\!\left(\sqrt{2m}\,\arcsin\sqrt{d_i},\ 1/2\right) \ \ \forall i, \text{ and}$$

$$\sqrt{2m}\,\arcsin\sqrt{K_i/m} \sim N\!\left(\sqrt{2m}\,\arcsin\sqrt{f_i},\ 1/2\right) \ \ \forall i.$$

Further, for each $i$, $i = 1, 2, \ldots, n$,

$$\sqrt{2m}\left(\arcsin\sqrt{J_i/m} - \arcsin\sqrt{K_i/m}\right) \sim N\!\left(\sqrt{2m}\left[\arcsin\sqrt{d_i} - \arcsin\sqrt{f_i}\right],\ 1\right).$$

Now under $H_0\colon d_i = f_i\ \forall i$, we have

$$\sqrt{2m}\left(\arcsin\sqrt{J_i/m} - \arcsin\sqrt{K_i/m}\right) \sim N(0, 1) \ \ \forall i.$$

Denoting $\sqrt{2m}\left(\arcsin\sqrt{J_i/m} - \arcsin\sqrt{K_i/m}\right)$ by $Z_i$
we have

$$(Z_i)^2 \sim \chi^2_1,\ i = 1, 2, \ldots, n, \quad \text{so that} \quad \sum_{i=1}^{n} (Z_i)^2 \sim \chi^2_n.$$

In order to test $H_0\colon d_i = f_i\ \forall i$ against $H_1\colon d_i \ne f_i\ \forall i$ at
size $\alpha$, we may reject $H_0$ if $\sum_{i=1}^{n} (Z_i)^2 > \chi^2_{n,\,1-\alpha}$.
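The approximate procedure is short enough to sketch in full. The Python below is an illustration under the stated normal approximation, not code from the thesis. The function names are hypothetical, and the chi-square tail probability is computed from the standard closed forms for integer degrees of freedom so that no external library is needed. Rejecting when the tail probability falls below the size is equivalent to comparing the statistic with the upper critical value.

```python
from math import asin, erfc, exp, gamma, sqrt


def chi2_sf(x, df):
    """Upper tail probability of a chi-square variate with integer df."""
    z = x / 2.0
    if df % 2 == 0:
        return exp(-z) * sum(z**k / gamma(k + 1) for k in range(df // 2))
    tail = sum(z ** (k + 0.5) / gamma(k + 1.5) for k in range((df - 1) // 2))
    return erfc(sqrt(z)) + exp(-z) * tail


def arcsine_test(j, k, m, size=0.05):
    """Approximate test of d_i = f_i for all i.

    j : true detections per subject (out of m attempts)
    k : false alarms per subject (out of m attempts)
    Returns (statistic, tail probability, reject flag).
    """
    z = [sqrt(2 * m) * (asin(sqrt(ji / m)) - asin(sqrt(ki / m)))
         for ji, ki in zip(j, k)]
    stat = sum(zi * zi for zi in z)
    p_value = chi2_sf(stat, len(z))
    return stat, p_value, p_value < size


# Illustrative counts for three subjects with m = 8; not the case study data.
print(arcsine_test([5, 6, 4], [3, 2, 4], 8))
```

The arcsine approximation is weakest when counts sit at 0 or m, which is one reason the text requires m to be sufficiently large.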


APPENDIX A


A FORTRAN CODE FOR COMPUTING OPERATING CHARACTERISTIC
CURVES FOR FISHER'S EXACT TEST OF PERCENTAGES


The code which follows evaluates equation 19 of
Chapter 2 appropriately to form the complementary cumulative
distribution functions of $J_i \mid L_i = l_i$ for each $l_i$, $l_i = 1, 2, \ldots, 2m$
under both the null and alternate hypothesis. The input
parameters PD, PF, and M correspond, respectively, to $d_i$,
$f_i$, and $m$ in equation 19. The values of PD and PF used as
input to the code can be any point on the alternate hypothesis
curve $d_i - f_i = (1 - f_i)/\{[f_i(D-1)]^{-1} + 1\}$. (The program
is useful, of course, for any investigation of Fisher's
Test whether or not related to the specific use intended
here.)

A subroutine format is used to allow maximum flexi-
bility in the application of the code. Depending on the
computer used, this subroutine should perform for values of
m up to at least 40. The output includes a full definition
of parameters and variables and is self-explanatory.

APPENDIX B


A FORTRAN CODE FOR DETERMINING THE ACCEPTANCE REGION


FOR THE SECOND STAGE TEST IN THE TWO STAGE PROCEDURE


In general, this program can be used to generate the
CDF's for two joint distributions of up to 12 independent
binomial distributions with first parameters given by
(KPHI(1), KPHI(2),...,KPHI(12)) in both cases and second
parameters given by (ALPHA(1), ALPHA(2),...,ALPHA(12)) for
the first joint distribution and by (XPI(1), XPI(2),...,
XPI(12)) for the second.

The program input parameters which relate to the
prototype case variables are:

KPHI(I)  - $\phi_i$    Number of first stage tests in the $i$th
                       group.

ALPHA(I) - $\alpha_i$  Size of first stage tests in the $i$th
                       group.

XPI(I)   - $\pi_i$     Power of first stage tests in the $i$th
                       group.

M        - $k$         Number of groups which contain at least
                       one test. (If desired, all groups, in-
                       cluding those for which $\phi_i = 0$, can be read
                       in to the program but this practice
                       increases computing time. If this prac-
                       tice is followed, then M corresponds
                       to $m$.)

Other program input parameters are:

START   The largest size for which a printed value of
        $\alpha'$ and $\pi'$ is desired.

SIZE    The smallest size for which a printed value of
        $\alpha'$ and $\pi'$ is desired.

KP      The kind of print parameter.
        KP=0 provides a complete printout of sizes and
        corresponding powers for all sizes between SIZE
        and START plus a summarized description of the
        acceptance region for a test of size SIZE.
        KP=-1 provides only the printout of sizes and
        corresponding powers for all sizes between SIZE
        and START.
        KP=1 provides only the summarized description
        of the acceptance region for a test of size SIZE
        plus a statement of the power of that test.
        NOTE: Regardless of the KP parameter specified,
        the output also always includes a recap of the
        $\phi_i$'s, $\alpha_i$'s, and $\pi_i$'s for verification purposes.

In addition to reading in the above parameters, it is
also necessary to insure that the dimension of KSUM, XLAM,
and KX is greater than or equal to $v$, which can be computed
by the formula $v = \prod_{i=1}^{k} (\phi_i + 1)$. With KSUM, XLAM, and KX so
dimensioned, the program will operate on any experiment for
which the number of subjects ($n$) is less than or equal to
30 and for which $k$, the number of non-zero $\phi_i$'s, is less
than or equal to 12. The program is readily extended to
larger problems but experience indicates that such cases
cause excessive computing time and/or exceed computer
storage capacity.